Application-specific array processors for binary prefix sum computation
Abstract
The main contribution of this work is to propose two application-specific bus architectures for computing the prefix sums of a binary sequence. Our architectures feature the following characteristics: (1) all broadcasts occur on buses of length 15 or 63; (2) we use a new technique that we call shift switching, which allows switches to cyclically permute an incoming signal, dramatically improving the performance of the reconfigurable bus system. As it turns out, our special-purpose architectures improve the performance of the best algorithm known to date by a significant factor. Specifically, our solutions require no adders, are faster, and use less VLSI area than the architectures of the state of the art.
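The idea of computing binary prefix sums by cyclically shifting a signal, rather than by adding, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' circuit: it assumes each switch holds one input bit and rotates a one-hot signal by one position when its bit is 1, so that the signal's position after each switch is the prefix sum modulo the signal width. The function names and the modulus of 4 are hypothetical choices made for illustration.

```python
from itertools import accumulate


def shift_switch_prefix_mod(bits, modulus):
    """Model a chain of shift switches (illustrative assumption, not the paper's design).

    A one-hot signal of width `modulus` enters the chain at position 0.
    A switch holding bit 1 rotates the signal by one position; a switch
    holding bit 0 passes it through unchanged. The position of the hot
    line after switch i equals (bits[0] + ... + bits[i]) mod modulus.
    """
    signal = [1] + [0] * (modulus - 1)  # one-hot encoding of residue 0
    residues = []
    for bit in bits:
        if bit:
            # Cyclic permutation: the last line wraps around to the front.
            signal = signal[-1:] + signal[:-1]
        residues.append(signal.index(1))
    return residues


def binary_prefix_sums(bits):
    """Reference result computed with ordinary addition, for comparison."""
    return list(accumulate(bits))


bits = [1, 0, 1, 1, 0, 1, 1, 1]
print(binary_prefix_sums(bits))          # full prefix sums
print(shift_switch_prefix_mod(bits, 4))  # the same sums modulo 4, no adder used
```

In this model no addition is performed on the data path; the count is carried entirely by the position of the hot line, which is the property the abstract attributes to shift switching.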
Similar resources
Some Image Processing Algorithms on a RAP with Wider Bus Networks
Based on the reconfigurable array of processors with wider bus networks [8], we propose a series of algorithms for image processing. Conventionally, only one bus is connected between two processors but in this machine it has a set of buses. Such a characteristic increases the computation power of this machine greatly. Based on the base-m number system, we first introduce some basic operation al...
High Performance Parallel Prefix Adders with Fast Carry Chain Logic
Binary adders are a basic and vital element of circuit design. Prefix adders are the most efficient binary adders for ASIC implementation, but these advantages do not carry over to FPGA implementation because of CLB and routing constraints. This paper presents different types of parallel prefix adders and compares them with the Simple Adder. The adders are designed using Verilog...
Implementation and Performance Evaluation of Prefix Adders using FPGAs
Parallel Prefix Adders have been established as the most efficient circuits for binary addition. The binary adder is the critical element in most digital circuit designs including digital signal processors and microprocessor data path units. The final carry is generated ahead to the generation of the sum which leads extensive research focused on reduction in circuit complexity and power consump...
A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure
The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traffic and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...